Abstract


A method is presented for the visual analysis of 
objects by computer. It is particularly well suited 
for opaque objects with smoothly curved surfaces. The 
method extracts information about the object's surface 
properties, including measures of its specularity, 
texture, and regularity. It also aids in determining 
the object's shape. 


The application of this method to a simple recognition
task -- the recognition of fruit -- is discussed.
The results on a more complex smoothly curved object, a 
human face, are also considered. 
requirements for the degree of Doctor of Philosophy, June 1970.
Chapter 1 The Problem


Consider the problem of programming a computer to
recognize objects with smoothly curved surfaces, such as
the object in the photograph of figure 1.1. Images such
as these can be digitized by an image-dissector camera,
so that the picture is represented by a raster of 
intensities at closely spaced sample points, represented 
numerically in figure 1.2. We will consider a method of 
processing such input with the ultimate goal of 
recognizing the object in the image. 

There are numerous more or less adequate known
techniques for classifying an image once significant
features have been extracted from it, but the problem of
extracting such features from the basic optical data is
less well understood. The methods which will be
discussed here are "low-level", in that they manipulate
actual picture points and try to extract salient
features, rather than working with high-level
descriptions and attempting to produce an identification.

Figure 1.1: A Simple Smoothly Curved Object

Figure 1.2: Sampled Light Intensities from the Apple of
figure 1.1. The intensities in this array have been scaled
to be between 0 and 99.

It must be recognized, however, that the so-called high-
and low-level aspects of vision cannot really be cleanly
separated. There is no foolproof, completely local way to
find features, as there will always be ambiguities which
can only be resolved through the use of context. For
example, one must know the light intensity (at least
roughly) in order to determine whether an object is white
or black, as a white object in very dim light can easily
reflect less light than a black object in sunlight. A plum
cannot be easily distinguished from an isolated grape,
unless the size is known. Highlights on a smooth surface
cannot be understood unless the form of the illumination
is known.

The context can of course be determined partly
from the scene itself. For example, a real scene will
generally contain surfaces with a wide range of
reflectivities. This establishes a light intensity
frame of reference in which the lighter objects will
appear white and the darker ones black. One cannot tell
the size of a white sphere alone in a photograph, but if
it is shown next to a tennis ball, its size is known by
comparison. (It is possible, but unlikely, that the
tennis ball is actually a scaled-up model three feet in
diameter. This usually happens only on movie sets.) In
a similar manner, the highlight on a known object gives
information about the lighting which can be used to
interpret the highlights on other objects in the image.


So far, the use of context has been considered
only on the level of object identification. Actually,
context is even more necessary at the level of finding
visual parts of objects, such as edges. A line-finding
program can be saved an enormous amount of work if it is
told approximately where to look. If a program thinks it
is seeing an apple, it can know that a good way to verify
this hypothesis is to look on top for a stem.

A program can only make use of these cues,
however, if it can pass information resulting from a
partial identification back to the low-level
feature-finding routines. This sort of system shall be
referred to as "vertical", in the sense that control passes
frequently between high- and low-level routines. The
term "horizontal" refers to a system which works in
stages, each of which produces a more abstract
representation of the scene. Much of the previous work
in vision has been of this sort. A typical sequence
might be to remove noise, enhance features, extract
features, group them, and then identify objects. Since
no provision is made in a horizontal system for passing
information back down this chain, the system cannot make
use of context information obtained from the image
itself.


The methods which will be presented here are
intended to fit into a vertical system in two ways.
First, they can be used to start off a vertical system
with information good enough to get it going. Second,
they extract features which are useful for object
identification. These features will be extracted in
such a manner that context information can easily be
used to advantage.

This work is intended to be a step towards making
computers see. This goal is interesting for a number of
reasons. Computers with vision would be useful for
applications in automation, and would be able to interact
better with humans. Computer vision may well provide
instructive models for the understanding of human vision.
The problem is also very interesting in its own right, as
an aspect of the study of Artificial Intelligence.


Chapter 2 Previous Work 


Techniques have been investigated which could be
applied to smoothly curved objects as a step towards
recognition.


2.1 Shape from Shading


It is possible to find out a great deal about the
shape of a smoothly curved object from a single monocular
image, given a knowledge of its surface reflection
properties and the position and nature of the light
sources. Horn [10] generates curves lying on the
surface of the object by an iterative solution of a set
of differential equations relating shape to the intensity
of image points. Similar methods have been applied to
the analysis of lunar topography from Lunar Orbiter
photographs [14,5].

This method requires a uniform object surface. 
Its reflectance must be a smooth function of the angle 
the surface makes with the incident and exit rays. Any 
marks on the surface will disrupt the solutions to the 


differential equations, although very small marks can be 


2.2 Detection of Optical Edges 


Much research has gone into the detection and 
tracing of contrast edges in an image. These edges can 
be emphasized by differentiation preprocessing 


operations, such as the gradient or Laplacian. 


2.2.1 Plane-surfaced Objects 


Edge detection is particularly attractive for 
plane surfaced objects. Since the edges are straight 
lines (the intersection of two planes), a determination 
of the position of the edges completely specifies the 
position of the plane surface which they enclose, and an 
edge itself can be located in terms of just a few of its 
points. 

A program by L. G. Roberts recognizes white plane
surfaced objects on a dark background [15]. He considers
objects which can be put together out of a set of given
sub-shapes, such as rectangular parallelepipeds and
wedges. The image is first differentiated. Lines are
then found in the resulting picture by a multiple-step
procedure, first fitting short lines to local areas,
eliminating tiny loops, then fitting longer and longer
lines to the shorter ones, and finally generating a
least-mean-square line which is taken to represent the
original edge.

The next phase is recognition of polygons in the
line drawing, followed by the matching of sets of
polygons against the possible models. The matching is
first done on a straight topographical basis. The
two-dimensional projection of a brick, for instance,
generally contains three quadrilaterals with one corner
point in common. No such point exists on a wedge.
Assuming, then, that this point corresponds to the corner
of a brick, the program can match the other lines and
points in the quadrilaterals to what must then be the
corresponding lines and points of the model. A
least-mean-square error matrix procedure is then used to
find the best brick (in 3-space) which generates the given
two-dimensional line drawing. If the least-mean-square
error is small enough, the fit is accepted as correct.
When a set of lines is matched by a model, the
model can then be projected back onto the line drawing,
but now with all of the hidden lines present. The model
is now "removed" from the line drawing, which may entail
the deletion of some lines, but also may entail the
addition of some others. The procedure is now iterated
until all of the lines of the input figure have been
accounted for. Thus objects are recognized as being
compounded of a number of the basic building blocks.

Roberts depends on a high degree of precision of
measurement of the position of the edges, since he uses
perspective in an essential way. Unfortunately, his
procedure is useless for objects lacking straight line
edges. One particularly interesting aspect of Roberts'
work is his use of a powerful internal model of the
potential object in the image. A similar approach might
be useful for scenes consisting of regular smoothly
curved objects such as spheres and cylinders, but it is
difficult to envision successful results using more
amorphous forms.

A program by R. W. Gosper visually locates white 
rectangular parallelepipeds on a black table. Due to the 
high reflectance difference between the objects and the 
background, the outer edges are very clearly defined. 
(The program also finds interior edges of the object
where the contrast between adjacent faces is high 
enough.) The edges are found by an algorithm which scans 
in a line perpendicular to the edge, and moves this line 
along the edge from one end to the other. From the 


position of the edges in the image, and the knowledge 
his use of second-order perspective effects. The use of
stereo distance determination would also require such
high precision. Gosper requires only medium precision.
His goal is to actually pick up the block, which only
requires locating it to within a centimeter or so. No
perspective, stereo, or other second-order effects are
used, so the calculated position is not as sensitive to
small errors in the line position. The programs of
Guzman and Griffith require only low precision, except in
a few parts which make use of the parallelism of two
lines.


2.2.2 Curved Edges 


There has been much study of recognition of
alphanumeric characters. Black characters on a white
background provide high-contrast edges, and some
character-recognition programs work by tracing around the
character's edge. There has been little edge-oriented
research on images derived from three-dimensional
objects, and the results of the two-dimensional work have
little relevance to this problem.

It is considerably easier to find a straight edge
than a curved one, since only two points determine a
straight line, and additional points can then be verified
by very sensitive tests. If many tests are positive
along a straight line, the existence of the edge can then
be asserted with a high statistical confidence, as by
Griffith's programs. These techniques can be used only
over a short interval for a curved edge.


2.3 The "Regions" Approach 


Instead of looking for high-contrast edges, some
pattern recognition methods look for homogeneous areas of
low contrast. Analysis then proceeds from the shape and
interrelations between these "regions". There are a
number of techniques for characterizing the shape of a
region, such as various moments [2], or more complicated
shape descriptors [3]. Kirsch [11] analyzes
photomicrographs of cells by building a tree structure of
image regions with various levels of homogeneity. His
methods are the closest in the literature to those which
are developed in this thesis.
2.4 Textural Information


The optical behavior of an object depends very
much on the texture of its surface. The word "texture"
may refer to either markings or departures from a smooth
surface, but in either case they must be small compared
with the size of the object in order to be considered
texture. Texture analysis may be done by a wide variety
of methods, such as Fourier analysis or
cross-correlation. Texture has been used to advantage in a
range of studies, in such areas as recognition of terrain
types [16] or cell images [13]. Different types of
texture will be discussed further in section 3.9.
Figure 3.1: The Intensity-region Tree

Chapter 3 Representing an Image as an Intensity-region Tree


3.1 The Basic Method Used 


Consider an image, I, defined on a rectangular
raster of points, so that I(p) is the light intensity at
the point p. For any given light intensity threshold t,
define a set of points $S(t) = \{\, p \mid I(p) \ge t \,\}$,
the set of points of intensity t or greater. Each of the
eight pictures in figure 3.1 shows such a set of points,
for some threshold. For any t, the set S(t) can be
partitioned into disjoint connected subsets $R_i(t)$,
which will henceforth be called "regions". Thus:

$$S(t) = R_1(t) \cup R_2(t) \cup \cdots \cup R_n(t),$$

where $R_i(t) \cap R_j(t) = \emptyset$ if $i \ne j$, and each
$R_i(t)$ is a connected set of points. Note that
$S(t_2) \subseteq S(t_1)$ if $t_2 > t_1$, so each region at
threshold $t_2$ must be a subset of some region at $t_1$.
The regions thus fall naturally into a tree structure
based on this subset relation, as shown in figure 3.1.
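
The thresholding and nesting just described can be sketched
in a few lines of Python. This is only an illustration and
not the program described in the appendix: the function name
`regions_at`, the small test image, and the use of
4-connectivity (the text does not say which neighbourhood
defines "connected") are all assumptions.

```python
def regions_at(image, t):
    """Return the connected regions R_i(t) of points with intensity >= t.

    `image` is a list of rows; points are (x, y) tuples.
    4-connectivity is an assumption made for this sketch.
    """
    rows, cols = len(image), len(image[0])
    seen, regions = set(), []
    for y in range(rows):
        for x in range(cols):
            if image[y][x] < t or (x, y) in seen:
                continue
            # Flood-fill one region starting from this unvisited point.
            region, stack = set(), [(x, y)]
            seen.add((x, y))
            while stack:
                px, py = stack.pop()
                region.add((px, py))
                for nx, ny in ((px + 1, py), (px - 1, py),
                               (px, py + 1), (px, py - 1)):
                    if (0 <= nx < cols and 0 <= ny < rows
                            and (nx, ny) not in seen
                            and image[ny][nx] >= t):
                        seen.add((nx, ny))
                        stack.append((nx, ny))
            regions.append(frozenset(region))
    return regions


# Hypothetical 4 x 5 image: two bright bumps joined by a dimmer saddle.
image = [
    [1, 1, 1, 1, 1],
    [1, 7, 2, 6, 1],
    [1, 8, 2, 5, 1],
    [1, 1, 1, 1, 1],
]

high = regions_at(image, 5)   # the two bumps are separate regions
low = regions_at(image, 2)    # the bumps have merged through the saddle

# Every region at the higher threshold lies inside a region at the lower one.
nested = all(any(r <= s for s in low) for r in high)
```

Linking each region to the region that contains it at the
next lower threshold yields the tree; the fork in this
example occurs where the two bumps merge.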

Another particularly graphic way of looking at
the tree is to visualize the intensity function plotted
in the form z=f(x,y). Slicing this function with a
horizontal plane at several threshold levels, the tree
can be pictured as in figure 3.2. An intensity contour
map of the pear is shown in figure 3.3 in order to show
how the regions are actually nested.

3.2 Quantization 


Choosing a set of threshold levels $\{t_i\}$ is
equivalent to quantizing the light intensities in the
image, in terms of the information retained in the tree.
The more threshold levels in the set, the greater the
depth of the tree generated using these levels. We will
generally consider threshold sets which are evenly spaced
in the log of the light intensity, although a tree could
be generated from any arbitrary set of levels. Using the
log of the light intensity generates a tree whose
structure remains basically the same if the illumination
is scaled up or down by a constant factor.
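
The invariance claimed here can be checked numerically. The
following is a minimal sketch, assuming 32 levels per factor
of two (the spacing of the vidisector readings quoted in
section 3.6); the function name `log_level` and the sample
intensities are hypothetical.

```python
import math


def log_level(intensity, levels_per_octave=32):
    """Quantize an intensity to the nearest level evenly spaced in log2."""
    return round(levels_per_octave * math.log2(intensity))


# Hypothetical intensities at four image points, in arbitrary linear units.
intensities = [10.0, 25.0, 40.0, 80.0]
brighter = [2.0 * v for v in intensities]   # same scene, illumination doubled

# Change in quantized level at each point when the illumination is doubled.
shifts = [log_level(b) - log_level(v) for v, b in zip(intensities, brighter)]
```

Every point moves by the same number of levels, so the
ordering and nesting of the regions, and hence the structure
of the tree, are unchanged; only the threshold labels shift.
For scale factors that are not a whole number of levels the
shift can differ by one level between points, which is
presumably why the structure is said to remain "basically"
the same.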
3.3 Geometry of the Tree 


Figure 3.2: The Region Planes Shown as Slices of the
Intensity Function (vertical axis: light intensity,
z = f(x,y))

In the limit of a continuous tree (in which the
spacing between threshold levels approaches zero), the
tips of the branches represent local maxima in the image.
Beginning at a branch tip and moving along it in the
direction of lower intensity, the region expands from the
maximum point to include other nearby points, assuming
the intensity function is continuous in that area. Each
tree branch can thus be thought of as a growing region.

A fork in the tree occurs whenever two or more of these
regions combine, forming one new larger region. In this
case, the branch associated with the sub-region of
largest area shall be considered the "main branch", and
the other branches shall be called "sub-branches". If
the original image is slightly noisy, then as a region
"expands" (moving along a tree branch from high to low
intensity), it will engulf large numbers of smaller
regions which appear ahead of its advancing edge,
resulting in many short sub-branches on the tree. When
two regions of substantial area are combined, it is not
really important which is considered the sub-branch.

The highest region on the tree represents the
brightest point in the image. If the threshold is
lowered far enough, all of the regions will eventually
merge into one region containing all of the image points.
This shall be referred to as the "root" of the tree.
3.4 Trees with Incomplete Region Information


In the preceding discussion, the regions
themselves have been considered to be the elements of the
tree. Let us now consider an abstract tree structure in
which the elements of the tree are not the regions
themselves, but nodes containing information about these
regions. Such a tree shall be called an "Image Tree".

If each node contains a complete description of the
region to which it corresponds (that is, if $R_i(t_j)$ is
given for all $i$ and $t_j$), then the tree contains enough
data to be able to reconstruct the image exactly, to
within the limits imposed by the quantization.

If each node contains only statistics of the
corresponding region, rather than a complete description
of the region, then the tree contains less information
than the original image. These are the interesting
trees, despite the fact that the image cannot be
reconstructed from them. The problem of pattern
recognition can be viewed as one of throwing away
information in a selective way. To go from a picture of
an apple to the word "apple" represents an enormous
reduction in information ("a picture is worth a thousand
words").

In general, the nodes may contain any arbitrary
set of functions of the corresponding region. In
particular, the ones which will be used are the position
of the region's center of mass $(x_c, y_c)$, the area A of
the region (i.e. the number of points in it), and a measure
of the second moment about the center of mass, called the
eccentricity e.


The eccentricity is defined by

$$e = \frac{2\pi}{A^2} \sum_{\text{all points } p \text{ in region}} \left[ (x_p - x_c)^2 + (y_p - y_c)^2 \right].$$

e is 1.0 for a perfectly circular region, and is larger
for a more elongated region.

The eccentricity is a dimensionless quantity,
which remains the same if the region size is scaled up or
down. It represents a normalized moment of inertia about
a line through the region center of mass perpendicular to
the region plane. It can be shown that no region can
have an eccentricity less than 1.0, and that any shape
other than a circle has a higher eccentricity. This is
because a circle has the smallest moment of inertia for a
given area.


For a 1 by f rectangle, the eccentricity is

$$e = \frac{\pi}{6} \left( f + \frac{1}{f} \right),$$

which is 1.047 for a square, 1.31 for a 2 by 1 rectangle,
and 2.23 for a 4 by 1 rectangle. For a high elongation
f, $e \approx \pi f / 6$.
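
These values can be reproduced from the definition. The
sketch below takes the eccentricity to be 2*pi times the sum
of squared distances from the center of mass, divided by the
square of the area, which is the normalization that gives 1.0
for a circle and agrees with the rectangle values quoted
above. The helper names and the raster sizes are
hypothetical.

```python
import math


def eccentricity(points):
    """Normalized second moment of a region about its center of mass."""
    area = len(points)
    xc = sum(x for x, _ in points) / area
    yc = sum(y for _, y in points) / area
    second_moment = sum((x - xc) ** 2 + (y - yc) ** 2 for x, y in points)
    return 2.0 * math.pi * second_moment / area ** 2


def rectangle(width, height):
    """All raster points of a width-by-height rectangle."""
    return [(x, y) for x in range(width) for y in range(height)]


def disk(radius):
    """All raster points within `radius` of the origin."""
    return [(x, y)
            for x in range(-radius, radius + 1)
            for y in range(-radius, radius + 1)
            if x * x + y * y <= radius * radius]


e_circle = eccentricity(disk(50))
e_square = eccentricity(rectangle(100, 100))
e_2_by_1 = eccentricity(rectangle(200, 100))
e_4_by_1 = eccentricity(rectangle(400, 100))
```

Because the sum runs over discrete sample points rather than
a continuous area, the computed values fall very slightly
below the continuous ones; the discrepancy is negligible for
regions of more than a few hundred points but becomes
noticeable for very small regions.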

Note that this definition of eccentricity is not
the standard eccentricity of second order curves. The
eccentricity of an elliptical region of semi-axes a and b
is

$$e = \frac{1}{2} \left( \frac{a}{b} + \frac{b}{a} \right),$$

which ranges from 1 to $\infty$. The normal definition of the
eccentricity of an ellipse is

$$\epsilon = \sqrt{1 - \frac{b^2}{a^2}} \qquad (a \ge b),$$

which ranges from 0 to 1.

More complex region statistics could be stored on
the tree. If the x and y second moments are stored
separately, then the "dominant axis" through the region
center of mass can be easily computed. This is a line in
the plane of the region points through which the region
has minimum moment of inertia. Higher moments could also
be computed, although their interpretation in terms of
high-level shape descriptors is less clear. More
complete shape descriptors, such as the results of a
Medial Axis Transform [4], could also be used.

The choice of more complex shape descriptors
depends on the particular recognition tasks being
performed. The simple statistics of area, center of
mass, and eccentricity can yield much useful information,
however, and attention will be focused on them. It will
be seen that they are quite useful for the analysis of
surface properties and simple shapes.


3.5 Sub-programs of the Image Tree System 


Programs have been written to obtain the image
tree of a given scene. Measurements from a laboratory
scene are read into an array by an image-dissector
camera, and a list-structure tree is generated. The tree
can be printed out, showing the parameters associated
with each node. Programs also can graph against the
threshold any region statistic stored on the nodes, along
some path on the tree from a branch tip to the root. The
original image can be displayed, and any arbitrary region
can be shown superimposed upon it. For more detail about
these programs, see the appendix.

3.6 The Tree of a Matte Sphere


Let us consider the tree resulting from an image
of a sphere with a matte surface. A matte surface
exhibits a reflectance which is fairly uniform in all
directions regardless of the angle of the incident light.
The image of a sphere is a circle. If we assume the
reflectance to be completely uniform, and consider a
sphere lit from the camera position, then the intensity
as a function of radius r over this circle is

$$I(r) = \left[ 1 - (r/R)^2 \right]^{1/2},$$

where R is the radius of the projected circle, and the
intensity is normalized to 1 at the central point. This
formula simply expresses the fact that the projection of
a surface seen by a viewer is proportional to the cosine
of the angle of the viewer from the normal to the surface
(see figure 3.4). Thus, assuming uniform scattering, the
intensity of the light is proportional to the cosine of
the incident (and viewing) angle. The intensity value
actually read from the vidisector is $t = C + 32 \log_2 I$,
where C is the reading at the central point and I is the
intensity. Solving for the region area as a function of
the threshold t, we get

$$A = B \left[ 1 - 2^{(t - C)/16} \right],$$

where B is the area of the full circle. Its image tree
should have only a single straight branch, whose tip
corresponds to the central point. Each of the nodes on
this branch represents a circular region centered about
this point.

Figure 3.4: Formula for the Reflectance of a Sphere. The
sphere is lit from the camera position.
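
The area formula can be checked against a synthetic image.
The sketch below assumes an ideal matte sphere of projected
radius R = 100 raster points and a central reading C = 100;
both numbers, and the function names, are hypothetical and
chosen only for illustration.

```python
import math


def theoretical_area(t, C, B):
    """Region area predicted for threshold t: A = B * (1 - 2**((t - C) / 16))."""
    return B * (1.0 - 2.0 ** ((t - C) / 16.0))


def measured_area(t, C, R):
    """Count raster points of a synthetic matte sphere whose reading is >= t."""
    count = 0
    for x in range(-R, R + 1):
        for y in range(-R, R + 1):
            s = 1.0 - (x * x + y * y) / (R * R)   # equals I(r) squared
            if s <= 0.0:
                continue                          # outside the projected circle
            reading = C + 32.0 * math.log2(math.sqrt(s))
            if reading >= t:
                count += 1
    return count


R, C = 100, 100.0
B = math.pi * R * R   # area of the full projected circle

# Relative disagreement between raster count and formula at three thresholds.
errors = []
for t in (C - 4.0, C - 16.0, C - 48.0):
    predicted = theoretical_area(t, C, B)
    errors.append(abs(measured_area(t, C, R) - predicted) / predicted)
```

At t = C - 16 the formula gives exactly half the area of the
circle, which is a convenient point at which to compare a
measured curve with the theoretical one.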

A picture of a white sphere on a black background
was actually read into the computer from the vidisector,
and a tree was generated by the procedure previously
described. The tree had essentially one main branch,
although there were a few very short sub-branches
representing regions of very small area, which were
neglected. The measured region area and the theoretical
curve are plotted together in figure 3.5.

Note that the measured curve rises considerably
above the theoretical curve in the central region. This
implies that the intensity is not linear in the cosine of
the incident angle, but is somewhat convex, as in figure
3.6. The sphere used for these studies had an extremely
matte surface, and hence a negligible highlight. The
sudden rise at the end of the curve is due to the threshold
lowering to below the intensity of points in the black
background.


Figure 3.5: 
Figure 3.6: Actual and Assumed Surface Reflectance

3.7 Effect of the Specular Component


As was discussed in chapter 2, the reflectance of
a surface can be considered to be a superposition of a
specular and a matte component. A mirrored sphere would
give rise to a pure specular reflection, which would
clearly be an image of the light source, plus a
reflection of anything else in the room. If the surface
is not highly mirrored, this specular component will be
greatly attenuated, so that it can be neglected, except
for the image of the bright light source, which will be
significant despite the attenuation. This reflection of
the light source is called a "highlight", and will
generally be considerably brighter than the surrounding
points. The magnitude of this highlight relative to the
matte component is a measure of the specularity of the
surface.

Consider the effect of this highlight on the
image tree, assuming the light to come from a small
(nearly point) source. This will produce a small, bright
spot on top of the local maximum in the matte component.
As a result, a long section of the tip of the tree will
represent a small region of fairly constant area. This
is a result of the "spike" in the light intensity
function resulting from the small, bright highlight.

Consider the set of spheres shown in figure 3.7.
They were all painted with a matte white paint, and then
coated with zero through seven coats of clear enamel,
giving them varying degrees of specularity. A graph of
the region area vs. threshold (figure 3.8) shows the
small flat section of the curve representing the
highlight, for one of the spheres. Figure 3.9 gives this
highlight depth as a function of the number of coats of
lacquer, illustrating how the surface specularity can be
measured in a simple manner. The irregularities in this
curve are probably due to the difficulty in applying the
coats of lacquer uniformly.


3.8 The Surface Convolution 


Locally, consider a curved surface to be a part 
of a sphere of the same radius of curvature. According 
to classical optics, a spherical mirror has a focal! 
length of one half [ts radius R, and will forma virtua! 
image of the light source as shown fn figure 3.10. If a 
light of diameter d and distance L from the object is not 
too far off the camera-object axis, then the diameter of 


its image Is about 
Figure 3.7: Specularity Test Spheres

Figure 3.8: Illustration of Highlight Depth

Figure 3.9: Highlight Depth vs. Number of Lacquer Coats

Figure 3.10: The Virtual Image Made by a Spherical Mirror

(As $R \to \infty$, $d' \to d$, as is indeed the case for a
flat mirror.) Thus if the size of the light source and the
approximate distance of the object from the camera are
known, the curvature of the surface can be determined
near a highlight. Even if the size of the light source
is not known, this method gives the relative curvatures
if there are several different highlights in the scene.
A good way to determine the size of the source is to take
advantage of verticality by knowing the approximate
curvature of some object in the image.
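
As a minimal sketch of how a highlight measurement could be
turned into a curvature estimate, the code below uses the
standard paraxial relation for a convex mirror of focal
length R/2, d' = dR / (R + 2L). That relation is an
assumption of this sketch; it is consistent with the
flat-mirror limit stated above but is not necessarily the
exact expression used in the original work. The function
names and the numerical values are hypothetical.

```python
def highlight_diameter(d, L, R):
    """Diameter of the virtual image of a source of diameter d at distance L,
    formed by a convex spherical mirror of radius R (paraxial approximation)."""
    return d * R / (R + 2.0 * L)


def radius_from_highlight(d, L, d_image):
    """Invert the relation above to estimate the local radius of curvature."""
    return 2.0 * L * d_image / (d - d_image)


# Hypothetical setup: a 10 cm lamp 100 cm from a surface of 5 cm radius.
d_image = highlight_diameter(10.0, 100.0, 5.0)
recovered = radius_from_highlight(10.0, 100.0, d_image)

# A nearly flat surface reproduces the source at almost full size.
nearly_flat = highlight_diameter(10.0, 100.0, 1.0e12)
```

When R is much smaller than L, the image diameter is nearly
proportional to R, so the ratio of two highlight sizes under
the same light approximates the ratio of the two radii of
curvature even if d is unknown.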

Many surfaces will "smear out" the image of the
light, resulting in a broader highlight than would be
gotten from a mirrored surface of equivalent curvature.
The highlight seen can be considered to be the
convolution of the image of the light source and the
"impulse response" of the surface reflectance. If the
light source is a sufficiently small point, then its
image can be considered to be an impulse, and the surface
"smear" function can be read directly from the region
area vs. threshold curve.
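
The convolution picture can be illustrated in one dimension.
In the sketch below the "smear" function and the two source
profiles are invented numbers, and `convolve` is a plain
discrete convolution written out for the purpose.

```python
def convolve(signal, kernel):
    """Full discrete convolution of two one-dimensional sequences."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


# Hypothetical impulse response ("smear" function) of a surface.
smear = [0.1, 0.2, 0.4, 0.2, 0.1]

point_source = [0.0, 0.0, 1.0, 0.0, 0.0]   # image of a very small light
wide_source = [0.0, 1.0, 1.0, 1.0, 0.0]    # image of an extended light

point_highlight = convolve(point_source, smear)
wide_highlight = convolve(wide_source, smear)

# Number of samples over which each highlight is non-zero.
point_width = sum(1 for v in point_highlight if v > 0.0)
wide_width = sum(1 for v in wide_highlight if v > 0.0)
```

With the point source the highlight profile is just a shifted
copy of the smear function, which is why the smear function
can be read off directly in that case; the extended source
gives a broader highlight.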
3.9 Texture

"Texture" refers to variations in the light
intensity which are very small in size compared to the
objects being recognized. It has two basic causes.
"Visual texture" is due to variations in the reflectance
of the surface, and "tactile texture" is due to minute
protrusions or depressions superimposed upon a basically
smooth surface (the sort of texture one can feel with a
finger). If the size of the texture is smaller than the
resolution with which the image has been sampled, the
intensity variations will average out, and the texture
will have little effect on the tree, aside from affecting
the surface "smear" function. If the texture is large
enough to be discernible, however, it will produce a
distinctive effect on the tree.

Texture is a multi-dimensional feature, and there
are a correspondingly large number of textural properties
which could be measured. We are not concerned here with
producing a complete description of texture, but rather
with detecting features which might be useful in making
an object identification. Although such features can
help discriminate between objects, they do not give
enough information to reconstruct the texture exactly.

3.9.1 Visual Texture


Consider the two spheres shown in figure 3.11.
The spheres were painted with a matte white paint, then
marked with red ink to produce visual texture. The same
two spheres are shown in red, white, and green light.
Since the red ink is highly reflective in the red, and
very absorptive in the green, these lighting conditions
produce light, medium, and heavy texture contrast
respectively, with all other factors being held constant.

There are two kinds of texture, with respect to
effect on the image tree. The right sphere shows small
disconnected light patches on a connected dark
background, and the left sphere shows disconnected dark
speckles on a connected light background. A light spot,
being a local maximum in the light intensity, will
produce a tree branch. The nodes on this branch will
represent regions the size of the spot, and so will have
very small area. The length of the branch will depend on
the relative brightness of the spot compared to its
neighbors, since when the threshold reaches the intensity
of the neighbors, the region corresponding to the spot
will be swallowed up by the larger region surrounding it.


Figure 3.11: Texture Test Spheres. Panels: high contrast
(green light), medium contrast (white light), low contrast
(red light).

Light speckles will thus produce a large number of
sub-branches whose length represents the intensity of the
speckle, and whose "size" (the size of the corresponding
regions) represents the size of the speckles. The tree
corresponding to the light-speckled sphere photographed
in the green light (deepest texture) is shown in
figure 3.12. Note the many branches produced by the
speckles.

The number and length of the sub-branches
provide a measure of the degree of contrast of the
texture. These quantities are shown in figure 3.13 for
the light-speckled sphere under the three lighting
conditions. Note how these quantities thus provide an
index of texture contrast, just as the highlight depth
and surface smear function provide an index of
specularity. Information about the details of the
texture can also be obtained, up to the limits imposed by
the particular shape descriptors used on the nodes of the
tree. Round speckles will produce regions of low
eccentricity, whereas streaks will produce regions of
very high eccentricity. If the direction of the dominant
axis of the region were recorded (corresponding to
recording the second moments in the x and y directions
separately), the dominant axis of the streaked texture
could be determined as well.

Figure 3.12: Tree of the Light-speckled Texture Test Sphere
(green light). All sub-branch nodes represent regions of
small area.

Figure 3.13: Number and Average Depth of Sub-branches for the
Light-speckled Texture Test Spheres
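
Counting sub-branches and measuring their lengths can be
sketched as follows. This is an illustration under stated
assumptions rather than the program used for figure 3.13:
regions are 4-connected, branch length is counted in
threshold levels, and at each fork the child of largest area
continues the main branch (as in section 3.3). The function
names and the one-row test image are hypothetical.

```python
def regions_at(image, t):
    """Connected regions (4-connectivity assumed) of points with intensity >= t."""
    rows, cols = len(image), len(image[0])
    seen, regions = set(), []
    for y in range(rows):
        for x in range(cols):
            if image[y][x] < t or (x, y) in seen:
                continue
            region, stack = set(), [(x, y)]
            seen.add((x, y))
            while stack:
                px, py = stack.pop()
                region.add((px, py))
                for nx, ny in ((px + 1, py), (px - 1, py),
                               (px, py + 1), (px, py - 1)):
                    if (0 <= nx < cols and 0 <= ny < rows
                            and (nx, ny) not in seen
                            and image[ny][nx] >= t):
                        seen.add((nx, ny))
                        stack.append((nx, ny))
            regions.append(frozenset(region))
    return regions


def sub_branch_lengths(image, thresholds):
    """Length, in threshold levels, of every sub-branch of the tree."""
    previous = []   # (region, level index at which its branch tip appeared)
    lengths = []
    for k, t in enumerate(sorted(thresholds, reverse=True)):
        current = []
        for region in regions_at(image, t):
            children = [(r, born) for r, born in previous if r <= region]
            if not children:
                current.append((region, k))          # a new branch tip
                continue
            # The child of largest area continues the main branch.
            children.sort(key=lambda rb: len(rb[0]), reverse=True)
            for _, born in children[1:]:
                lengths.append(k - born)             # sub-branch ends here
            current.append((region, children[0][1]))
        previous = current
    return lengths


# Hypothetical one-row image: a broad bright area and two light speckles
# of heights 5 and 3 on a background of intensity 0.
image = [[0, 5, 0, 9, 9, 9, 0, 3, 0]]
lengths = sub_branch_lengths(image, range(0, 10))
number = len(lengths)
average_depth = sum(lengths) / number
```

In this example each sub-branch length equals the height of
the corresponding speckle above its surroundings, which is
the sense in which branch length measures texture contrast.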


Dark speckles will have a different effect,
however. Since they are local minima in the intensity
function, rather than local maxima, they will not produce
branches on the tree, but rather will produce holes in
regions. This is shown by the tree of the dark-speckled
sphere, shown in figure 3.14. The only effect of these
small holes is to raise the eccentricity of the growing
region, as shown in figure 3.15, which shows the main
branch eccentricity vs. region area for the dark-speckled
sphere in the three different colored lights. Since the
eccentricity change is so small, these three curves can
be compared in this way only because all factors except
the degree of texture were held absolutely constant: the
same sphere was viewed from exactly the same camera
position and with exactly the same light source. Nothing
was moved; only the filter over the light was changed.

The difference between the trees for the dark-
speckled and the light-speckled spheres (figures 3.12 and
3.14) exposes a basic asymmetry in the image tree with
respect to light and dark. This asymmetry is not just
confined to texture, of course. Locally bright areas
will always produce regions and hence tree nodes, while
locally dark areas will always produce holes in regions,
altering the statistics of nodes that would otherwise
exist anyway.

Figure 3.14: Tree of the Dark-speckled Texture Test Sphere

Figure 3.15: Eccentricity vs. Region Area for the
Dark-speckled Texture Test Spheres

The tree could easily be extended to find dark
speckles by generating an "inverted" tree for the area
inside each region. An inverted tree is a tree in which
the regions represent image areas less than threshold,
instead of greater than or equal to. This will be
further discussed in section 5.4.2.


3.9.2 Tactile Texture 


Small bumps on the surface of an object
essentially produce many tiny "micro-objects" with the
same surface properties. If the size of these is below
the resolution of the image sampling, the effect will be
only on the surface smear function. If the texture is
larger than that, and the surface is fairly specular, the
result will be many tiny highlights, producing the
equivalent of a light-speckled visual texture.


3.10 Shape


The image tree carries shape information in two
ways: in its form, and in the behavior of the region
statistics stored along its branches. The interpretation
in terms of object shape of the simple region statistics
discussed so far depends upon the object being simply
shaped, since the eccentricity does not give enough
information to distinguish between different
complex-shaped regions. Nevertheless, much useful shape
information can be obtained even with very simple
statistics, particularly in a recognition-oriented
application in which there can be restrictions on the
shapes considered.


3.10.1 The Main Branch 


Consider the object shown in figure 3.16. Its
tree is a single main branch, just as in the case of a
sphere (a crude contour map is shown in figure 3.17).
The simplest indicator of its shape is the eccentricity
of the entire object, which is about 1.4, clearly
indicating it to be quite elongated. The entire curve of
eccentricity vs. threshold is shown in figure 3.18. The
flatness of this curve indicates that the region probably
doesn't change its shape very much as it grows, and that
it has a smooth surface with no significant
irregularities. This is not a unique interpretation of
the curve, but is a reasonable inference given the
assumption that the object is not highly irregular. The
bump in the eccentricity curve at the bright end is
typical of a small newly developing region. Since the
slope of the light intensity function is very small near
a local maximum, a small region about that point will
tend to have jagged edges, and hence a high eccentricity.
As the region expands, the intensity gradient at the edge
increases, so the edge becomes straighter, and the
eccentricity is reduced.

Figure 3.16: A Matte-white Painted Squash

Figure 3.17:

Figure 3.18: Eccentricity Curve of the Squash

Consider the plot of added region area, shown in
figure 3.19. This quantity shows the excess area added
to a region above the sum of the areas of its sub-
regions. Since the intensity measured is a monotonic
function of the angle of the surface to the camera, the
added region area is the projected area of that part of
the surface of the object with a particular slope. A
bump in this curve represents a large area of relatively
low curvature. The only one in this case is near the
highlight.
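
The added region area can be illustrated with a short Python sketch. The `Node` class, with its `area` and `children` fields, is a hypothetical stand-in for whatever the tree programs actually store, and the areas are made-up numbers; only the subtraction itself comes from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical tree node: one region at one threshold level."""
    threshold: int
    area: float
    children: List["Node"] = field(default_factory=list)


def added_area(node: Node) -> float:
    """Area of the region in excess of the sum of its sub-regions."""
    return node.area - sum(child.area for child in node.children)


def added_area_curve(branch: List[Node]) -> List[float]:
    """Added area at each node of a branch, in the order given."""
    return [added_area(n) for n in branch]


# A single main branch, listed from the tip (brightest) downward.
tip = Node(threshold=200, area=4)
mid = Node(threshold=190, area=10, children=[tip])
low = Node(threshold=180, area=30, children=[mid])
print(added_area_curve([tip, mid, low]))  # [4, 6, 20]
```

On a branch with no side branches the curve is simply the first difference of the region area curve, so a bump in it marks a threshold step that swept in an unusually large part of the surface.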

Figure 3.20 shows what the area added to a region
looks like: it is the area of a region minus the area of
all its sub-regions. Note that the statistics used are
such that from the statistics of a region A and those of
a sub-region B, the statistics of the difference A-B can
be computed. (To compute the eccentricity of region A-B,
the eccentricity, region area, and center of mass
position of regions A and B must all be known.)

Figure 3.19: Added Region Area Curve of the Squash

Figure 3.20: Illustration of an Added Area Region
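
As a partial sketch of how such a difference can be computed, the Python fragment below recovers the area and center of mass of A-B from those of A and B, using only the fact that areas and first moments add over disjoint pieces. The function name and the numbers are illustrative assumptions. The eccentricity of A-B is not attempted here, since it depends on the definition of eccentricity given elsewhere in the text.

```python
from typing import Tuple

Point = Tuple[float, float]


def difference_region(area_a: float, com_a: Point,
                      area_b: float, com_b: Point) -> Tuple[float, Point]:
    """Area and center of mass of A-B, where B is a sub-region of A.

    First moments are additive over disjoint pieces, so
    area_a * com_a = area_b * com_b + area_diff * com_diff.
    """
    area_diff = area_a - area_b
    if area_diff <= 0:
        raise ValueError("B must be a proper sub-region of A")
    cx = (area_a * com_a[0] - area_b * com_b[0]) / area_diff
    cy = (area_a * com_a[1] - area_b * com_b[1]) / area_diff
    return area_diff, (cx, cy)


# Made-up numbers: A has area 30 centered at (10, 10); its sub-region B
# has area 10 centered at (10, 12).
print(difference_region(30.0, (10.0, 10.0), 10.0, (10.0, 12.0)))
# (20.0, (10.0, 9.0))
```

The center of mass of the added area lies on the opposite side of A's center from B, which is what makes it useful for noticing the direction in which a region is growing.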

Computing information about the shape of such a 
difference region gives information about bulges 
developing in a region, direction of motion of the center 
of mass, and other properties of all those points on the 
surface within some given range of inclination to the 
camera. 

The added area curve would have two peaks for the
hypothetical object shown in figure 3.21, due to the low
curvature of the annular region indicated. In this case
the eccentricity would be constant at 1.0 and the center
of mass position would be stationary, since the regions
would all be concentric circles due to the rotational
symmetry. For the pear-like object in figure 3.22, the
protrusion would also increase the added area curve, but
in this case, the eccentricity would increase as well,
and the center of mass would shift.
Figure 3.21: A Symmetrical Object with Two Added Area Peaks

Figure 3.22: A Contour Map of a Hypothetical Object with a
Protrusion (the protrusion is labeled in the figure)


3.10.2 Sub-branches


Protrusions of the sort illustrated in
figure 3.22 will often produce significant sub-branches
on the tree. The meaning of a sub-branch must be
interpreted in conjunction with the information stored on
it, and on the main branch to which it attaches. The
attachment of a protrusion region, for example, will
generally produce a rise in the eccentricity of the main
region, and a shift in its center of mass. The possible
interpretations of a sub-branch depend very heavily on
the particular identification for which the tree is being
used. A discussion of the interpretation of shape
information for a particular set of test objects will be
given in section 4.2.


3.10.3 Non-interference of Texture with Shape 


Figure 3.23 shows graphs of the region area for
the speckled spheres of figure 3.11, normalized to the
light intensity. These graphs illustrate that the basic
shape-describing parameters are not affected by object
texture in a significant way. This is basically due to
the averaging nature of the region descriptors used.
This insensitivity to textural interference is a great
improvement over most previous methods used on curved
objects, such as Korn's analytical method, which is
completely useless in the presence of texture. Edge-
finding methods are also confused by sharp texture. This
advantage is very important in the recognition of real
objects, as will be seen in the next chapter.

Figure 3.23: Region Area Curves for the Light-speckled
Texture Test Spheres
Chapter 4 Use on Real Objects


4.1 Pruning


Regions generated by smooth objects with smooth
surfaces should in theory always have smooth boundaries.
In an actual image, however, minute surface fluctuations
and noise will cause the edge of the region to be highly
irregular. If the irregularities are great enough, small
sections of the region will be detached; that is, they
will actually form separate small regions. Since the
area separating these small regions from the edge of the
nearby large region is only slightly dimmer than the
region points, these small regions will join the main
region at a threshold only slightly lower than that at
which they started. They will thus produce very short
branches on the image tree, whose regions are of small
area. These regions are essentially artifacts of the
particular levels at which the threshold is placed, and
thus have no particular significance. In order to avoid
the waste of space and time needed to store and analyze
these branches, they can be "pruned" away as the tree is
generated. This is done simply by removing branches


Figure 4.1: Apple

Figure 4.2: Region Area of the Apple

Figure 4.3: Region Center of Mass of the Apple

Figure 4.4: Eccentricity of the Apple

Figure 4.5: Tree of the Apple

level recognition routines could take advantage of this
fact to help find stem areas.

Now consider the pear shown in figure 4.6. Its
tree, shown in figure 4.7, is topologically similar to
the tree of the apple, including a small sub-branch with
significant area. The graphs of the various parameters,
however, shown in figures 4.8, 4.9, and 4.10, reveal that
this sub-branch has a different interpretation than in
the case of the apple. First, its center of mass shows
it to be positioned to the left of the main region,
rather than directly above it. Second, at the point at
which the two branches join, there is a rise in the
eccentricity in the case of the pear, whereas there is
not in the case of the apple. Finally, the eccentricity
of the apple just before breakthrough into the background
was near 1.0, whereas the eccentricity of the pear is
about 1.2, which is significantly higher. Information is
also available concerning the surface properties of the
pear. The pear's highlight shows a wider "impulse
response", which indicates that its surface, although
somewhat shiny, is not as highly specular as the apple.


Figure 4.6: Pear

Figure 4.7:

Figure 4.8:

Figure 4.9: Eccentricity of the Pear

Figure 4.10: Region Area of the Pear


4.3 Useful Features for Fruit Recognition


We will now attempt to list some features which
can be easily extracted from the image tree, so that the
classification of fruit may be systematized. This list
is not intended to be exhaustive. Quite to the
contrary, it is intended to show that recognition of
fruit is possible with only a few very simple features.


4.3.1 A Sample Set of Fruit 


In the course of studying the image tree method,
a large number of fruit were processed to study the
effects on real images. In addition, a large number of
fruit were given identical processing under identical
conditions one day in order to gather some statistics on
the various features which can be extracted. Photographs
of the fruit in this sample set are shown in figure 4.11.
The fruit used were Bartlett pears, Macintosh apples,
sweet pears, and oranges. The test images include five
views each of the Bartlett pears for a total of 25, two
views each of the apples (total 10), three of the sweet
pears (total 15), and one each of the oranges. Three
taped images of peaches are also included in the sample
set, although they were recorded under different
circumstances. Peaches were unavailable at the time the
sample set was run.

Figure 4.11: The Fruit in the Sample Set


4.3.2 Specularity 


As was discussed in section 3.7, the "impulse
response" of the surface can be approximately obtained
from the region area vs. threshold curve at a branch tip.
We would like to characterize this curve in order to
extract some significant features that are useful for
recognition purposes. One way to do this is shown in
figure 4.12. At the branch tip, the second derivative of
the region area curve is positive due to the specular
component, but negative due to the matte component. A
straight line fitted to the curve at the inflection point
is shown, extended to intersect the axis. The
intersection point is called the "matte intercept". The
value of the curve above this intercept is used as a
measure of the width of the surface function, as shown on
the figure. It is called s, for the highlight "smear"
width.

Another measure of the surface function is the
amplitude of the highlight, also marked in the figure.
This can be measured in various ways, but is here
measured as the amplitude of the highlight above the
matte intercept.

Figure 4.12:
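
One possible reading of this construction is sketched below in Python. Several points are assumptions rather than statements of the original procedure: the inflection point is located as the sample of steepest slope, the fitted line is the tangent there (from a central difference), the "axis" is taken to be the line of zero region area, s is read as the value of the area curve at the matte intercept, and h as the height of the branch tip above the intercept. The sample curve is made up.

```python
from typing import List, Tuple


def highlight_features(thresholds: List[float],
                       areas: List[float]) -> Tuple[float, float, float]:
    """Return (matte_intercept, s, h) for one branch tip.

    thresholds: intensity levels, strictly decreasing from the tip.
    areas: region area at each level (growing as the threshold drops).
    """
    n = len(areas)
    if n < 3 or n != len(thresholds):
        raise ValueError("need at least three matching samples")

    # Central-difference slope d(area)/d(threshold) at interior samples;
    # the steepest sample is used as the inflection point (assumption).
    def slope(i: int) -> float:
        return (areas[i + 1] - areas[i - 1]) / (thresholds[i + 1] - thresholds[i - 1])

    i_inf = min(range(1, n - 1), key=slope)  # most negative slope
    m = slope(i_inf)
    if m >= 0:
        raise ValueError("area must grow as the threshold drops")

    # Tangent line through the inflection point, extended to area = 0.
    intercept = thresholds[i_inf] - areas[i_inf] / m

    # s: value of the area curve at the intercept (linear interpolation).
    s = areas[0]
    for k in range(n - 1):
        hi, lo = thresholds[k], thresholds[k + 1]
        if lo <= intercept <= hi:
            frac = (hi - intercept) / (hi - lo)
            s = areas[k] + frac * (areas[k + 1] - areas[k])
            break

    # h: height of the branch tip above the matte intercept.
    h = thresholds[0] - intercept
    return intercept, s, h


# Made-up curve: convex near the tip (highlight), then concave (matte).
levels = [100, 99, 98, 97, 96, 95, 94]
area = [0, 1, 3, 7, 11, 14, 16]
print(highlight_features(levels, area))  # (98.75, 1.5, 1.25)
```

On real data the area curve is noisy, so a least-squares fit over several samples around the inflection point would probably be more robust than a single central difference.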

A scatter diagram of the smear width s vs. the
highlight amplitude h is shown in figure 4.13. Note that
the peaches, apples, and oranges are separated very well
by their highlight properties, but that the two types of
pears not only have similar properties, but also show a
very high degree of variation in these parameters. This
is partly because their surfaces are rather lumpy and
uneven, which disrupts the highlight region. As will be
seen later, this unevenness can be used to help identify
them.


4.3.3 Simple Global Properties 


Two very simple properties of a fruit are its
brightness and its size. These are both properties which
are useful only relative to some additional information
not contained in the image alone; specifically, the light
intensity and the object's distance from the camera. If
this information is available, these two features can
contribute recognition information. These quantities can
be obtained, in many cases, from other known objects in
the image. In the experiment described next, the sample
fruit were all viewed with the same light intensity and
at the same distance from the camera, so that their
intensity and size are comparable.

Figure 4.13: Smear Width vs. Highlight Amplitude
for the Sample Set

The brightness of an object is taken to be the 
intercept of the straight line approximation to the matte 
component with the line of zero region area, thus 
estimating the brightness of the surface if there were no
highlight. The overall area is estimated by scanning up
from the root of the tree until the first local minimum 
in the slope of the region area curve is found. The 
region area of this node is taken as the object's 
projected area (see figure 4.12). 

A scatter diagram of these two quantities is
shown in figure 4.14 for the sample fruit. They are
clearly not very useful for distinguishing between the
fruit in the sample set. They would be very helpful if
very large objects such as watermelons were included,
however.

Another optical feature which could be used is
color, which would be very powerful for fruit. This
feature was not studied in our experiments, because the
processing of different color images of the same object
would have added complexities and delays without much
added understanding of the image tree.

Figure 4.14: Brightness vs. Overall Area for the Sample Set


4.3.4 Overall Shape 


Our simplest shape descriptor is just the
eccentricity of the entire fruit outline region, which is
shown plotted with the highlight depth in figure 4.15.
This parameter alone will identify a banana, which has
not been included in the sample set. Note that oranges
and apples are extremely round.


4.3.5 Sub-branch Types 


So far, we have used only information extracted
from the main branch. Many properties of an object
produce sub-branches. In understanding an image we must
figure out what these sub-branches represent. Some types
of sub-branches will now be discussed, and a simple sub-
branch classification algorithm presented.


4.3.5.1 Tactile Texture


The oranges in the sample set supply good
examples of tactile texture. A close examination shows
their surface to be covered by small bumps and valleys.
Since the surface is also highly specular, this
graininess produces myriad small highlights, as discussed
in section 3.9.2. These produce small short branches on
the tree. Textural branches represent regions of small
area, and are near the tip end of the tree. The number
of sub-branches on a tree identified as textural by the
classification algorithm shall be denoted by the
variable T.

Figure 4.15: Object Eccentricity vs. Highlight Depth
for the Sample Set


4.3.5.2 Stems 


The Bartlett pears show large, long, light-
colored stems. The branches produced by these stems are
easily identified by their small size and large
eccentricity. The number of stem branches is denoted
by S.


4.3.5.3 Protrusions 


A pear is basically a spherical shape with a
protruding bump. These protrusions will frequently
produce a major sub-branch on the tree, as in the case of
the pear discussed in section 4.2. Such protrusions
generally have a large area, and usually produce a
significant jump in the eccentricity of the main branch
at the point where they join it. The number of
protrusions will be denoted by the letter P (usually
0 or 1).


4.3.5.4 Stem Hollows 


An apple has a somewhat conical depression on top
at the spot where the stem is attached. The stem itself is
smaller and darker than in the case of the pear. This
stem hollow will often produce a separate branch on the
tree, as the light reflected from the back of the hollow
is surrounded by darker points on the rim of the hollow.
Furthermore, the dark stem will often bisect this region,
producing two sub-branches. Thus a significant sub-
branch which causes a drop in the main branch
eccentricity when it joins is likely to be a stem hollow,
and this is reinforced if there is another similar region
nearby. The number of stem hollow regions is denoted by
the letter H (usually 0, 1 or 2).


4.3.5.5 Surface Irregularities


There are frequently a number of branches which
do not fall into any of the above categories. These
often are due to irregularities in the surface of the
object. These irregularities are larger than what is
called tactile texture, but smaller than those large
enough to be called protrusions. The number of such
branches shall be denoted by the letter I.
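
The five categories described above can be summarized as a small rule-based sketch in Python. This is not the classification algorithm of the text: the decision order and every numerical cutoff below are hypothetical placeholders chosen only to make the qualitative descriptions (small and elongated, small and near the tip, large with a rise in eccentricity, a drop in eccentricity, everything else) concrete. The field names are likewise assumptions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class SubBranch:
    area: float           # region area just before joining the main branch
    eccentricity: float   # eccentricity of the sub-branch region itself
    delta_e: float        # change in main-branch eccentricity at the join
    join_position: float  # 0.0 = near the branch tip, 1.0 = full object


# Hypothetical cutoffs, for illustration only.
SMALL_AREA = 20.0
LARGE_AREA = 200.0
HIGH_ECCENTRICITY = 2.0
TIP_REGION = 0.1
E_CHANGE = 0.05


def classify(b: SubBranch) -> str:
    """Return one of 'T', 'S', 'P', 'H', 'I'."""
    if b.area <= SMALL_AREA:
        if b.eccentricity >= HIGH_ECCENTRICITY:
            return "S"  # small and elongated: stem
        if b.join_position <= TIP_REGION:
            return "T"  # small and near the tip: tactile texture
        return "I"
    if b.area >= LARGE_AREA and b.delta_e >= E_CHANGE:
        return "P"      # large, raises main-branch eccentricity
    if b.delta_e <= -E_CHANGE:
        return "H"      # lowers main-branch eccentricity: stem hollow
    return "I"          # anything else: surface irregularity


# Made-up sub-branches; the tally gives the counts T, S, P, H and I.
branches = [
    SubBranch(5, 1.1, 0.0, 0.05),
    SubBranch(10, 3.0, 0.0, 0.5),
    SubBranch(300, 1.2, 0.1, 0.8),
    SubBranch(100, 1.1, -0.1, 0.6),
    SubBranch(100, 1.1, 0.0, 0.6),
]
print(Counter(classify(b) for b in branches))
```

In practice the cutoffs would have to be set from measurements on the sample set, and the reinforcement of a stem hollow by a second similar region nearby is not modeled here.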


4.3.6 Sub-branch Classification 


A very simple algorithm was written to classify
sub-branches. It is shown in flow chart form in figure
4.16. The parameter A represents the area of the sub-
branch just before it joins the main branch. The
parameter Δe is the change in the eccentricity of the
main branch at the point where the sub-branch joins. Δe
is positive if the sub-branch produces an increase in the
eccentricity, and negative if it produces a decrease.
The parameter j tells where on the main branch the sub-
branch is attached, on a scale from 0.0 (matte intercept)
to 1.0 (full object). If the sub-branch joins the main
branch in the highlight region (above the matte
Figure 4.16: Sub-branch Classification Algorithm

Figure 4.17: Object Identification Algorithm

having lots of texture branches and being very round.
Apples show stem hollows and are very round. A stem area
identifies a Bartlett pear immediately. The two types of
pears are sorted out on the basis of their eccentricity,
the number of protrusion branches, and the number of
irregularities. Round objects with essentially no
highlights are peaches.

The flow chart shown correctly identified all of
the fruit with the exception of one Bartlett pear (BP11)
which was identified as a sweet pear. The pertinent data
for each of the sample fruit are shown in figure 4.18.

Our conclusion is that recognition of images of
single fruits is relatively easy using the image tree.
The image tree allows the easy extraction of enough
information about surface properties, shape
irregularities, and general shape, as well as helping to
spot specific characteristics such as stem hollows and
stems, and the procedures which extract this information
are reasonably simple. More complex routines which take 
the trouble to look more closely at the tree's statistics 
should be even more reliable. 

The recognition procedures described would be
disrupted (as would many others) by occlusions, shadows,
missing stems, and object positions which hide
significant features. Many of these problems could be
eased by a suitable vertical system, which could use
other knowledge to explain and correct changes in the
image tree. Other problems can be solved without higher-
level aid, simply by making the recognition routines more
clever. For example, occlusions can generally be
detected by the way in which two regions connect. Once
an object is known to be partially occluded, corrections
can be made to its region statistics which give an idea
of its form, under the assumption that the visible and
the hidden parts are similar.

Figure 4.18: Eccentricity and Sub-branch Types for the
Sample Set. Legend: Bartlett Pear, Sunkist Orange, Sweet
Pear, MacIntosh Apple, Peach; S: Stem, H: Stem Hollow,
P: Protrusion, I: Irregularity, T: Texture. BP11 was
incorrectly identified.

Even in the presence of severe occlusion
problems, the tree still gives valuable local information
about highlights and texture. Although the stems gave
significant aid in identifying Bartlett pears, the stems
were not seen in ten of the test cases, yet nine of these
were correctly identified.


4.4 Faces 


This section illustrates the behavior of the
image tree produced from a more complex smoothly curved
object: a human face. It is included to show another
example of a real recognition task for which the image
tree is potentially useful. A tree was generated from an
image of a face, seen full face and lit from the front.
This tree is shown in figure 4.19. Branches of the tree
have been labeled with the local maxima on the face to
which they correspond, and the shapes and positions of
these regions are shown in figure 4.20. These regions
might be useful for face recognition, at least for the
simple angle of view and lighting considered here.

Contour maps at a single level of the tree are
shown in figure 4.21, for each of two levels (marked in
figure 4.19). At level 313, most of the major regions
seen in the photo appear, with the exception of the lower
lip highlight, which is considerably dimmer. The contour
map at level 268 is rather interesting. Consider not the
region included within the contour, but the area
excluded. This includes most of the mouth, the eyebrows,
the eyelids (the eyes are closed), the nostrils, and a
shadow area on either side of the nose. These are
locally dark areas in the image. These could be isolated
by making an inverted tree, that is, by making a tree
with the image negated. These locally dark areas are
probably better places to begin face location, since
there are fewer of them than there are locally bright
areas, and they are more prominent. Indeed, there are
experiments which indicate that as babies learn to see
faces, they first fixate on the pair of eyes [1,6]. Once
a face is roughly located, higher level routines can make
sense of the locally bright areas with less difficulty.
Figure 4.22 shows a contour map with both levels
superimposed, with the dark regions shaded.

Figure 4.19: Tree of a Face

Figure 4.20: Some of the Regions of the Face Tree. Panels:
face alone; regions superimposed on face (regions
corresponding to boxed nodes).

Figure 4.21: Slices of the Tree at Two Thresholds
(threshold 268 and threshold 313)

Note that the image tree can easily be used to
isolate facial features and determine their approximate
position. In order to better characterize their shapes,
more complex shape descriptors would probably be needed
than those which have been used so far. The image tree
can be used to characterize the shapes of objects, such
as noses, which have no "hard edge" boundary. This will
be further discussed in section 5.2.4.


Chapter 5 Discussion


5.1 Comparison with Previous Work 


The image tree can now be situated among the
pattern recognition methods discussed in chapter 2. It
is a "regions" method, rather than an edge detection
scheme, and does no differentiation or other pre-
processing of the image. It extracts information about
both the surface properties of an object and its
shape. It does not require any high degree of precision
of measurement with regard to the exact location of
specific points in the image, and does not make any
essential use of perspective information. It does not
attack problems of the "parsing" of an image into its
component parts directly, although it may aid this
process by the way it organizes the image information.


5.2 Advantages 


The image tree has a number of advantages for
pattern recognition over many previously used methods,
will represent the entire scene, and various sub-branches
will represent sub-parts, and then sub-parts of the sub-
parts. The tree can thus be thought of as providing a
range of measurements of differing degrees of acuity.
These notions of pattern recognition as a sort of
"measuring" problem are due to Kirsch.


5.2.4 Objects Without Boundaries 


The image tree is easy to apply to the
recognition of objects without real edges or well-defined
boundaries, such as a nose, or an object lit so that one
side fades off gradually into shadow. Assuming the
object produces a separate tree branch, it can be
analyzed from the data at the tip of the branch, working
down towards the base until the parameters indicate that
the region is taking in too much extraneous area to be
useful. Thus some information about a nose can be
extracted even though it has no well-defined upper
boundary, because it has well-defined lower and side
boundaries. This simple task can be rather complicated
for edge-oriented procedures, or for programs which are
regions oriented but which do not make a series of
related measurements at different levels, as in the tree.
By the same arguments, the tree will contain information
about a smoothly curved object even if it is partially
obscured, provided it contains a local brightness
maximum. The procedures which analyze the tree must be
able to detect the occlusion and to try to compensate for
it.


5.3 Problems 


The separation of coarse and fine information is
not always maintained by the tree, unfortunately. When
branches representing two different objects merge,
information about those parts of the object not yet
filled out by the region may be lost. If a small
highlight area is swallowed up by a larger region before
achieving much depth in its own right, the information
that would have been obtained about the local surface
properties of that area is swamped out. When a region
representing some object in a scene joins with a larger
region representing the background, the information about
the smaller object is lost. One case in which this can
occur is when a dark object is on a light background, or
near a lighter object. Or, alternatively, a region may
extend beyond the boundaries of an object on one side
before reaching the boundary on the other side, possibly
due to an overall gradient in the light intensity. This
shall be referred to as a "breakthrough". Although it
can usually be easily detected by its effect on the
region parameters (sharp rise in the region area and
eccentricity, and sudden shift in the center of mass), it
still means a loss of information about the side of the
object which the region has not "filled".
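
A breakthrough test along these lines might be sketched in Python as follows. The record layout and the three cutoffs (`area_ratio`, `ecc_jump`, `com_shift`) are hypothetical; the text names the three symptoms but gives no numerical criteria, and requiring all three at once is a choice made here for illustration.

```python
import math
from typing import List, Optional, Tuple

# One record per threshold level along a branch, tip first:
# (area, eccentricity, (center_x, center_y)). Hypothetical layout.
Level = Tuple[float, float, Tuple[float, float]]


def find_breakthrough(levels: List[Level],
                      area_ratio: float = 2.0,
                      ecc_jump: float = 0.3,
                      com_shift: float = 5.0) -> Optional[int]:
    """Index of the first level showing all three symptoms, else None."""
    for i in range(1, len(levels)):
        a0, e0, c0 = levels[i - 1]
        a1, e1, c1 = levels[i]
        area_rise = a0 > 0 and a1 / a0 >= area_ratio  # sharp rise in area
        ecc_rise = e1 - e0 >= ecc_jump                # sharp rise in eccentricity
        shift = math.dist(c0, c1) >= com_shift        # sudden center-of-mass shift
        if area_rise and ecc_rise and shift:
            return i
    return None


# Made-up branch: steady growth, then a sudden spill at the last level.
branch = [
    (100.0, 1.00, (50.0, 50.0)),
    (120.0, 1.00, (50.0, 50.0)),
    (140.0, 1.02, (50.0, 51.0)),
    (400.0, 1.60, (62.0, 55.0)),
]
print(find_breakthrough(branch))  # 3
```

The statistics of the level just before the returned index are the last ones that describe the object alone.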


5.4 Further Considerations 


5.4.1 Other Statistics 


So far region shape has been characterized by the
region area, eccentricity, and center of mass position.
There are many other region statistics which could be
used to characterize the regions, depending upon the
particular recognition task at hand.

One very simple addition which could be made
would be to compute the x and y second moments
separately, so that the major axis of the region could be
found. This is the axis about which the region has a
minimal moment of inertia. This would allow the tree


predicate, there is always the possibility that a high
eccentricity may be due to a perfectly round region, but 
with a large hole in the middle. 
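
The ambiguity can be made concrete with a moment-based measure. The Python sketch below uses 2πI/A² (I the polar second moment about the center of mass, A the area), which equals 1 for an ideal disc. This normalization is an assumption chosen because it behaves the way the text describes; it is not necessarily the definition of eccentricity used in the programs. For a ring of outer radius R and inner radius r the same quantity is (R² + r²)/(R² - r²), which is well above 1 even though the outline is perfectly round.

```python
import math
from typing import Iterable, List, Tuple

Pixel = Tuple[int, int]


def normalized_inertia(pixels: Iterable[Pixel]) -> float:
    """2*pi*I / A**2, with I the polar second moment about the center
    of mass and A the pixel count. Close to 1 for a filled disc."""
    pts = list(pixels)
    area = len(pts)
    cx = sum(x for x, _ in pts) / area
    cy = sum(y for _, y in pts) / area
    inertia = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in pts)
    return 2.0 * math.pi * inertia / area ** 2


def ring(outer: int, inner: int = 0) -> List[Pixel]:
    """Pixels with inner**2 <= x*x + y*y <= outer**2 (inner=0: a disc)."""
    return [(x, y)
            for x in range(-outer, outer + 1)
            for y in range(-outer, outer + 1)
            if inner * inner <= x * x + y * y <= outer * outer]


disc = normalized_inertia(ring(20))
holed = normalized_inertia(ring(20, 10))
# The disc comes out near 1; the holed disc comes out markedly higher,
# although both have the same circular outline.
print(round(disc, 2), round(holed, 2))
```

A measure of this kind cannot by itself distinguish a holed round region from an elongated solid one, which is the point made in the text.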

In general, any sort of shape-descriptor
algorithm can be applied to the regions, such as the Blum
algorithm (Medial Axis Transform) [3]. I believe,
however, that one of the strengths of the image tree as a
method is to allow easy recognition with relatively
simple region shape descriptors. Using very complicated
descriptors not only will consume a great deal of
computer time, but will also complicate the analysis
required of the higher-level programs. A more detailed
shape analysis should probably be reserved for cases in
which problems arise in the simpler procedures.


5.4.2 Region-hole Duality


The tree procedures are not symmetric with
respect to light and dark, as has been pointed out
earlier. Thus a black spot on a light object is not
perceived as an object, but as a hole in a region.
Furthermore, these holes are not detected by the
programs, and insufficient information is stored on the
tree to tell that they are there. Thus the effect of a
hole is to decrease the region area, and increase the
region eccentricity, but it is not detected as a hole per
se. In the detection of texture, black speckles have a
completely different effect than white speckles. An
object is harder to recognize on a white background than
on a dark one.

This is not a desirable situation. An object
should be easy to recognize on any highly contrasting
background, regardless of whether it is darker or lighter
than the object. A possible solution would be to make
two trees, one with the image negated. Thus one would be
the tree already discussed in detail, and the other would
be a tree of dark regions on lighter backgrounds, in
which the tips of the branches would represent locally
dark areas, rather than locally light ones. For the face
considered in section 4.4, these dark branches would
represent significant locally dark areas, such as the eye
sockets, the nostrils, and the dark areas along the side
of the nose. The eye sockets and the nostrils, in
particular, are probably very important in orienting
visually with respect to a face.

There is no reason why this procedure should not
be carried to more than one level. Whenever a region is
isolated, the contiguity scan routines could be called
again, but scanning only inside the region, and with
their sense inverted, so that they would find holes.
Small holes could then be eliminated, but if there were
any large ones, they would be noted on the tree.
Furthermore, the sense could then be inverted once more,
and the contiguity scan tried once again to find
additional light regions inside the dark holes.

This procedure would succeed in finding a dark
apple on a light background. The apple could be isolated
by an inverted run of the tree procedures, and then the
normal procedure could be carried out on the region thus
isolated.
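
The inverted scan proposed above can be sketched as follows. This is an illustration under stated assumptions, not part of the programs described in this work: the region is assumed to be given as a rectangular array of marks, a hole is taken to be a connected group of unmarked points that cannot reach the edge of the array, and the background is traced with four-neighbor connectivity. The function name and the `min_area` parameter (standing in for the elimination of small holes) are hypothetical.

```python
from collections import deque

def find_holes(mask, min_area=1):
    """Return the holes of a region as a list of sets of (x, y) points.

    `mask` is a list of rows; a true entry marks a point of the region.
    """
    height, width = len(mask), len(mask[0])
    seen = [[False] * width for _ in range(height)]
    holes = []
    for y0 in range(height):
        for x0 in range(width):
            if mask[y0][x0] or seen[y0][x0]:
                continue
            # Inverted sense: collect connected UNMARKED points.
            group, touches_border = [], False
            queue = deque([(x0, y0)])
            seen[y0][x0] = True
            while queue:
                x, y = queue.popleft()
                group.append((x, y))
                if x in (0, width - 1) or y in (0, height - 1):
                    touches_border = True
                for nx, ny in ((x + 1, y), (x - 1, y),
                               (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Groups reaching the border are background, not holes;
            # holes below min_area are dropped as insignificant.
            if not touches_border and len(group) >= min_area:
                holes.append(set(group))
    return holes
```

Each hole found this way could in turn be scanned with the sense inverted once more, to look for light regions inside it.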


5.4.3 Complex Lighting 


In the above discussion, it was assumed that the
illumination was coming from a single point source.
Changing the source of the illumination will change the
properties of the highlight region, but will not alter
the basic properties of the tree. If the illumination is
from a diffuse source, specularity information is lost.
Light from several point sources will produce multiple
highlights. If the high-level parsing routines know
about the light source, they can compensate for these
effects. By making hypotheses about the objects in the
image, these routines could equally well find out about
the lighting from the image.

5.4.4 Isolation of Regions


A by-product of the image tree is the isolation
of regions which can be used as data for other feature
extraction programs. One might, for example, take a
fairly large region around the highlight, subtract out
the small region containing the highlight itself, and
hand this difference region to a textural analysis
program. This program could use this region to extract
texture information in various ways, such as performing a
Fourier transform, autoconvolution, or similar
processing, obtaining information about surface speckles
not available directly from the tree. Using a region
generated from one of the tree nodes helps assure that
the portion of the image upon which the analysis is
performed is a suitable one.
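
The difference region described above can be sketched as a set subtraction. This is only an illustration: regions are assumed to be sets of (x, y) points, and the intensity variance computed here is a deliberately simple stand-in for a real textural analysis such as a Fourier transform. Both function names are hypothetical.

```python
def difference_region(outer, inner):
    """Points of the larger region that are not in the highlight region."""
    return set(outer) - set(inner)

def intensity_variance(image, region):
    """Variance of the intensities over a region (a crude texture measure).

    `image` is a list of rows of intensities, indexed as image[y][x].
    """
    values = [image[y][x] for x, y in region]
    if not values:
        raise ValueError("empty region")
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)
```

With the highlight subtracted out, its very bright points no longer dominate the measure, so what remains reflects the surface around it.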

5.5 Summary and Conclusions


A procedure has been outlined for processing
images of three-dimensional objects with smoothly curved
surfaces. The method is able to extract some information
about the surface properties of the objects, such as the
texture, specularity, and surface irregularity.
Information about shape is also extracted. The
procedures are insensitive to noise and distortion, and
can be used to perform real recognition tasks. It is
hoped that this work will provide a stepping-stone in the
challenging study of computer vision.

Appendix: Description of Algorithms


This appendix contains an outline of the
algorithms used in the tree generating program.

The image tree is generated one threshold level
at a time, starting at the highest level (branch tips).
At each level, the image is scanned, and the points above
the threshold are marked in a scratch array. This
scratch array is then scanned for marked points. When one
is found, a contiguity routine is called, which visits
all marked points which can be reached from the start via
a connected path. The marks are erased by this routine
as it goes, and statistics are kept on the region thus
generated, such as the sums of the x and y coordinates of
the points, and the sum of the squares of the x and y
coordinates (used to compute the center of mass and the
eccentricity). A tree node is then made up for the
region, and the scan for marked points continues. A
special mark is left in the scratch array for each
region. When this mark is encountered during the scan
at the next level, it is looked up on an association
list. This establishes the link between a region and
the regions which are a subset of it at the previous
level, i.e. between a node and its sub-nodes.
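
The level-by-level construction outlined above can be sketched in a few lines. This is a simplified illustration, not the MIDAS program: it uses an explicit stack in place of the pointer-based contiguity routine, and an `owner` array of node indices in place of the special marks and association list. The function name, the node fields, and the choice of a strict "greater than" comparison against the threshold are assumptions.

```python
def build_image_tree(image, thresholds):
    """Return a list of nodes; each node lists its sub-nodes by index.

    `image` is a list of rows of intensities, indexed as image[y][x].
    """
    height, width = len(image), len(image[0])
    # Node index left behind for each point by the previous level.
    owner = [[None] * width for _ in range(height)]
    nodes = []
    for level in sorted(thresholds, reverse=True):  # highest level first
        marked = [[image[y][x] > level for x in range(width)]
                  for y in range(height)]
        for y0 in range(height):
            for x0 in range(width):
                if not marked[y0][x0]:
                    continue
                # Visit every marked point reachable from the start,
                # erasing marks and keeping statistics along the way.
                stack = [(x0, y0)]
                marked[y0][x0] = False
                points, children = [], set()
                sum_x = sum_y = 0
                while stack:
                    x, y = stack.pop()
                    points.append((x, y))
                    sum_x += x
                    sum_y += y
                    if owner[y][x] is not None:
                        children.add(owner[y][x])
                    for dx in (-1, 0, 1):
                        for dy in (-1, 0, 1):
                            nx, ny = x + dx, y + dy
                            if (0 <= nx < width and 0 <= ny < height
                                    and marked[ny][nx]):
                                marked[ny][nx] = False
                                stack.append((nx, ny))
                node_id = len(nodes)
                nodes.append({"threshold": level, "area": len(points),
                              "sum_x": sum_x, "sum_y": sum_y,
                              "children": sorted(children)})
                # Leave this node's mark for the scan at the next level.
                for x, y in points:
                    owner[y][x] = node_id
    return nodes
```

Since every point above one threshold is also above the next lower one, the mark found at a point always belongs to a region of the immediately preceding level, which is what makes the sub-node links correct.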

The contiguity scan is the most complex program.
It works by leaving directional pointers in the scratch
array. These are three-bit codes denoting one of the
eight possible neighboring points. The contiguity scan
is always started at a point which is on the bottom edge
of the region. It traces along this edge to the right by
moving from one marked point to the next, but always
keeping an un-marked point to the right side. As it
goes, it erases the marks, so that for a region with
smooth boundaries, it will follow a spiral path to the
center, "eating up" the marks as it goes, like a lathe
with the tool continually advancing into the work.

As the contiguity routine scans, it lays down
back pointers in the scratch array which enable it to
retrace its path back to the start. If a dead end is
reached (no more marked neighbors), it traces back along
this path, looking for marked points to the right. There
can be no marked points on the left side while
backtracking, since this was the right side on the way
out, and the outgoing scan stayed as far to the right as
possible. If a marked point is found on the backtrace,
it is replaced with a pointer to the adjacent path
already traced out, and then a new path is traced as if
this were a new starting point. When the backtrace
reaches the original starting point, the contiguity scan
is completed. The effect of this algorithm is to
construct a tree of pointers in the scratch array, with
the starting point at the root. All points which can be
reached via a connected path from the starting point will
be a part of this tree, an example of which is shown in
figure A1.
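
The back-pointer idea can be sketched as follows. This is a simplified illustration and not the routine described above: it does not follow the region edge with an un-marked point kept to the right, so it does not reproduce the spiral path, but it does build the same kind of tree of pointers in the scratch array and needs no separate stack. The constants `BLANK`, `MARKED`, `ROOT` and the ordering of the eight direction codes in `DIRS` are assumptions.

```python
# Eight neighbor directions, coded 0..7 (three bits each).
DIRS = [(1, 0), (1, 1), (0, 1), (-1, 1),
        (-1, 0), (-1, -1), (0, -1), (1, -1)]
BLANK, MARKED, ROOT = -1, -2, 8

def contiguity_scan(scratch, x0, y0):
    """Replace all marks reachable from (x0, y0) by pointers toward the root.

    `scratch` is a list of rows, indexed as scratch[y][x]. Returns the
    number of points visited (the region area).
    """
    height, width = len(scratch), len(scratch[0])
    scratch[y0][x0] = ROOT
    x, y = x0, y0
    area = 1
    while True:
        for code, (dx, dy) in enumerate(DIRS):
            nx, ny = x + dx, y + dy
            if (0 <= nx < width and 0 <= ny < height
                    and scratch[ny][nx] == MARKED):
                # Erase the mark, leaving a back pointer: the opposite
                # direction code points to the point we came from.
                scratch[ny][nx] = (code + 4) % 8
                x, y = nx, ny
                area += 1
                break
        else:
            # Dead end: no marked neighbors, so retrace one step.
            code = scratch[y][x]
            if code == ROOT:
                return area  # back at the start, scan complete
            dx, dy = DIRS[code]
            x, y = x + dx, y + dy
```

Each point is entered once on the way out and revisited only while backtracking, and marks that are not connected to the starting point are left untouched for a later scan.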

An algorithm developed by S. Bryan [4] could
speed the contiguity scan considerably. It entails
coding the scratch array line by line as strips, as in
figure A2. Each strip is specified by its y coordinate,
and the x coordinates of its left and right end. The
contiguity of these strips is then checked, rather than
operating on the individual points. This algorithm not
only avoids scanning the entire scratch array, most of
which is blank, but also requires fewer operations to
find all of the contiguous points, since they are
gathered into groups. It thus takes advantage of the
fact that regions produced by real images, as opposed to
random noise, will tend to have the points clustered into
bunches.
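
The strip coding can be sketched as below. This is an illustration of the general idea only, and is not taken from the Bryan algorithm itself: the merging of touching strips is done here with a union-find structure, which is my own choice, and strips in adjacent rows are treated as contiguous if they overlap or touch diagonally. The function names are hypothetical.

```python
def code_strips(mask):
    """Code each row as strips (y, x_left, x_right), ends inclusive."""
    strips = []
    for y, row in enumerate(mask):
        x, width = 0, len(row)
        while x < width:
            if row[x]:
                start = x
                while x < width and row[x]:
                    x += 1
                strips.append((y, start, x - 1))
            else:
                x += 1
    return strips

def group_strips(strips):
    """Gather strips into contiguous regions; returns a list of lists."""
    parent = list(range(len(strips)))

    def find(i):
        # Follow parent links to the representative, halving the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    by_row = {}
    for i, (y, _, _) in enumerate(strips):
        by_row.setdefault(y, []).append(i)
    for i, (y, left, right) in enumerate(strips):
        for j in by_row.get(y + 1, []):
            _, left2, right2 = strips[j]
            # Contiguous if the strips overlap or touch diagonally.
            if left2 <= right + 1 and right2 >= left - 1:
                parent[find(i)] = find(j)
    groups = {}
    for i, strip in enumerate(strips):
        groups.setdefault(find(i), []).append(strip)
    return list(groups.values())
```

The area of each region then follows from the strip lengths alone, as the sum of (x_right - x_left + 1) over its strips, without visiting individual points.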

Figure A1: The Tree of Pointers Laid Down by the Contiguity
Scan Algorithm (shown for an arbitrary region). Legend:
marked point, included in region; pointer in direction of
root (arrowhead not shown due to small size); root of tree.

Figure A2: A Region Coded as Strips (the same region is
used as in figure A1).

A number of other programs were written in the
course of this research. In order to make it convenient
to study a large number of trees, programs were written
to print out the trees on the line-printer, with the
significant parameters associated with each node.
Furthermore, a program was produced to plot any parameter
vs. threshold along any set of branches of the tree.
This program was used to produce the graphs in this
paper.

Programs were also written to display an
intensity modulated picture of the image, using the seven
intensity levels of a DEC 340 display. Since our 340 has
no fast raster mode, a display compiler was written which
generates a display list in increment mode, allowing
fairly large images to be shown virtually flicker-free.
Other routines enable any arbitrary region in the image
to be shown superimposed on this picture. The pointer
method used in the contiguity scan was actually written
for these display routines, which were developed first.
The existence of this program made the writing of the
contiguity scan very simple, which is one reason why
faster algorithms such as the Bryan algorithm were not
sought.

A large amount of code was required to back up
the programs mentioned above. This includes a dynamic
storage allocator for manipulating a large number of
arrays of changing size, display and plotter routines and
other I/O routines, routines for manipulating list
structure, and routines which map arbitrary local
procedures over an array. The programs comprise over
5200 words of PDP-10 MIDAS assembly language code, not
including about 1700 words of fixed buffer and tables,
and not including the dynamically allocated array and
list structure area, which can grow to an arbitrary size.

Also used was the CNTOUR program [12], which
draws intensity contour maps of an image, and which was
written early in the course of this research, before the
exact area of study had been decided upon.